Coarse Lexical Semantic Annotation with Supersenses: An Arabic Case Study

Authors

  • Nathan Schneider
  • Behrang Mohit
  • Kemal Oflazer
  • Noah A. Smith
Abstract

“Lightweight” semantic annotation of text calls for a simple representation, ideally without requiring a semantic lexicon to achieve good coverage in the language and domain. In this paper, we repurpose WordNet’s supersense tags for annotation, developing specific guidelines for nominal expressions and applying them to Arabic Wikipedia articles in four topical domains. The resulting corpus has high coverage and was completed quickly with reasonable inter-annotator agreement.

Similar resources

Exploiting Semantic Constraints for Estimating Supersenses with CRFs

The annotation of words and phrases by ontology concepts is extremely helpful for semantic interpretation. However, many ontologies, e.g. WordNet, are too fine-grained, and even human annotators often disagree about the precise word sense. Therefore we use coarse-grained supersenses of WordNet. We employ conditional random fields (CRFs) to predict these supersenses taking into account t...

SemEval-2016 Task 10: Detecting Minimal Semantic Units and their Meanings (DiMSUM)

This task combines the labeling of multiword expressions and supersenses (coarse-grained classes) in an explicit, yet broad-coverage paradigm for lexical semantics. Nine systems participated; the best scored 57.7% F1 in a multi-domain evaluation setting, indicating that the task remains largely unresolved. An error analysis reveals that a large number of instances in the data set are either har...

A Corpus and Model Integrating Multiword Expressions and Supersenses

This paper introduces a task of identifying and semantically classifying lexical expressions in running text. We investigate the online reviews genre, adding semantic supersense annotations to a 55,000 word English corpus that was previously annotated for multiword expressions. The noun and verb supersenses apply to full lexical expressions, whether single- or multiword. We then present a sequenc...

Ontology-Based Semantic Annotation of Arabic Language Text

Semantic annotation is the process of adding semantic metadata to resources. Semantic metadata is data concerning the meaning of entities and the relationships that exist. Semantic annotation cannot be performed without an ontology suitable for the task. In this research paper, we describe the design, implementation, and evaluation of a lexical ontology for Arabic semantic relations. The main p...

Publication date: 2012